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Abstract 

In this paper we extend our previous investigations on systemic features of the immune system, 
based on a statistical mechanics approach. In particular, we recently introduced a mean-field spin- 
glass model for the interaction between helper cells and the effector branches (B and K cells) able 
to reproduce, as emerging properties, several collective phenomena shown in real immune networks 
(e.g. the connection between autoimmunity and lymphoproliferative disorders or the breakdown of 
immunosurveillance by diminishing the amount of helpers in the system). Here, we go beyond the 
previous fully-connected approximation by introducing dilution in the interactions between helpers 
and B clones, and show that this makes the former able to orchestrate parallel strategies to fight 
several pathogens simultaneously. This is an important step forward toward a comprehension of 
these systems since dilution, which is a biological requisite, results in multitasking capabilities. The 
latter are indeed the core of the immune system as always multiple attacks are present in a host. 
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1 Introduction 



While the first half of the XIX century saw triumphal discoveries in physics, ranging from quantum 
u mechanics and general relativity up to the discovery of chaos, the second half has been probably dragged 

by biology: Among its several fields of investigation, immunology (both theoretical and experimental) 
is currently one of the most intensive and promising. (Adaptive) Immune systems are networks of 
I lymphocytes exchanging chemical signals and proteins as cytokines or antibodies. In a nutshell, the main 

constituents of the immune system are the effector branches (made of by B cells and Killer cells) and 
the Helper cells [TJ. B cells produce antibodies and are grouped into clones: each cell belonging to a 
r- — i given clone produces the same antibody, while different clones produce different antibodies. When a 

pathogen, e.g. a virus, enters the body, the clone producing the best matching antibody undergoes clonal 
expansion and secretes the antibody able to chemically bind the pathogen hence (possibly) avoiding 
the propagation of the infection. If the virus has already infected a host cell, the latter is killed (e.g. 
via lysis) by Killer cells and order is restored. Actually, responses by effector branches can take place 
only if another signal (beyond the presence of the antigen) occurs. This signal is promoted by helpers 
which coordinate/modulate the response by exchanging with the effector branches both eliciting (e.g. 
the interleukin-4 cytokine) and suppressive messages (e.g. the inter leukin- 10 cytokine). Given the large 
t— I amount of its constituents (e.g. the complete B-repertoire in humans is estimated to range in 10 8 — 10 10 

clones) and the interest in understanding global 'collective" features of the immune system thought of 'as 
a whole" , scientists are becoming attracted towards the potentiality of statistical-mechanics approaches 
even in this branch of theoretical biology (see for instance [El E21 EH1 ES])- However, a theoretical 
motivation for the use of canonical disordered statistical mechanics in biological world is not so obvious 
as biological systems are not isolated, and the blind application of the second law of thermodynamics 
locally may be debatable. One can argue a sort of "local-reductionism" : despite within the host all 
networks (e.g. neural, endocrine, immune, etc) interact synergically, as a first approximation they can 
be considered independently (as in the approach paved in neural networks [5] ) and the cells building the 
system retain only their peculiar function (as for instance firing/not-firing a spike in the neural scenario, 
or secreting/not-secreting antibodies for the B-cells in the present context). Furthermore, one may 
escape theoretical justification showing accordance with data and noticing that the Maxwell-Boltzmann 
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distribution (at least for linear forces, namely quadratic Hamiltonians) coincides with the prediction of 
the maximum entropy models in statistical inference [34, 35 , widely used for fitting procedure without 
paying attention to the whole scaffold offered by statistical mechanics. 

With this premise we continue the investigation of a model introduced in [4] and meant to capture, as 
emergent features, some collective properties of the immune system. The model is based on spin-glasses 
|37| and their equivalence with information processing systems |17) as Bolzmann machines and Neural 
networks. More precisely, the interactions between helper cells and B-cells via cytokines (which are both 
eliciting or suppressive) is shown to give rise to an effective Hebbian structure among helpers alone, in 
which they are able to develop "strategies" to coordinate properly the effector branches [4] . Interestingly, 
bypassing the previous mean-field approximation, where each helper interacts with each B-clone, toward 
a description where each helper interacts only with a fraction of the available repertoire of B-cells, which 
is a biological mustQ makes the "cognitive capabilities" of the helpers able to manage parallel processing. 
This means that helpers succeed in performing multiple retrievals (i.e., strategies, instructions for the 
B-cells) at the same time, without falling into spurious states (i.e. errors) typical of the underlying glassy 
nature of neural networks. This is an important step forward toward a rationale understanding of the 
systemic properties of immune networks. 

The paper is organized as follows. In section 2 we review the minimal, fully-connected model previously 
introduced in 4_, while in Sec. 3 we explain how dilution is introduced and we scaffold the statistical- 
mechanics analysis, then, in Sec. 4 we study in details the parallel retrieval performed by the system and 
in Sec. 5 we give some insights in the numerical methods exploited; finally, Sec. 6 is left for discussions 
on results and on future perspectives. 



2 The minimal model: features and limitations 

In this section we briefly review the minimal model we introduced in [4] for the core of the immune 
system. As in the original model there is full symmetry between the two effector branches B and K, in 
the following we consider only the B-branch. 

At equilibrium, when no external antigens are present and neglecting second order subtleties as auto- 
activation due to Jerne cascades [23 001 E] (see also Sec. 3.1), the system is built by H different helper 
clones and B different B-cell clones. Each B-cell clone, say the i th , is associated to a degree of activation, 
referred to as hi, which represents the extent of the clonal size with respect to a given equilibrium value. 
Setting, without loss of generality, the equilibrium value equal to zero for all clones i 6 (1, B), we have 
that bi > (hi < 0) means that the clone has expanded (shrunk). Due to fluctuations, memory effects 
and noise, overall the distribution of the clonal activations for the effector branches P(bi) is assumed as 
a central Gaussian Af{0, 1], in agreement with experimental results |36) . 

Interactions among B-cells (which are not treated here) exist too and, roughly speaking, work through 
a complementary affinity exchange such that if a B-cell (belonging to a clonal lineage) produces the 
antibody 10000 in some epitopal alphabet (see for instance the phase shape of j3H H3] or the interaction 
matrix of JSJ IS] ) , its antibody can be recognized by the cells (clonal lineage) producing the anti- antibody 
01111 or very similar variants. As a consequence, cells belonging to the same clone do not interact 
directly and there is no intra-clonal cooperative response to an external stimulus such as an antigenic 
attack. This would lead to a Michaelis-Menten 50 [57] behavior of the B-cell response. On the other 
hand, the overall bell-shaped response [T] is the result of extra-clonal cooperative interactions and of 
interactions with helper cells (through cytokines). 

As for helpers, the situation is quite different because they interact each others via cytokine signalling 
[36j|49] which is non-specific, so that intra-clonal response can be highly cooperativ^] Thus, the activity 
hi can be better approximated by a step function (a steep hyperbolic tangent), which can be seen as 
a digital switch w.r.t the B kinetics. Consequently, we define their status (active/inactive) as a binary 
Ising spin such that if hi = —1 the i th clone is quiescent, while if hi — +1 it is firing, namely secreting 
cytokine^] 

lr The reason is that the interaction between B cells and helper cells is essentially local: on the one hand B cells act 
on helpers as antigen presenting cells and this requires MHC-II:antigen to bind properly with the TCR available from the 
helper ensemble, on the other hand, helper cells act on B cells via cytokine exchanges which are diffusive and short-ranged 

2 In particular biochemistries quantify the property of cooperation by a direct measure of the Hill coefficient C |27 ] |21 | . 
such that for its high values, e.g. C > 4 there is a strong sigmoidal shape in the response: The system shows unresponsiveness 
to small stimuli while it is maximally responding once the stimulus reaches a threshold and stay stable beyond. 

3 We stress that, as simplifying assumptions, we neglected details about the subclasses T/^jT^ |36| which would clarify 
the cytokine loops inside helpers and we neglected further the "hierarchical" strength of various cytokines |33| (e.g. inter- 
ferons usually induce stronger responses w.r.t. interleukins) assuming all the chemical signals as equivalent and loosing a 
more complex alphabet for chemical messengers. 
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As cytokines may have either positive (expansion) and negative (suppression) signs, we model the 
system of B and H interactions as a bipartite spin glass admitting the Hamiltonian representation^] 



H,B 



H 

i, ft 



where £f is the cytokine connecting the i th helper clone with the [i th B-clone Hence, overall, H x B 
cytokines have to be defined. 

The partition function Zh.b{P) of this system can be written as 



{h} J M=l 



2 / H B \ 2 

= E ex P ^EE gtiWi = £ eM-f3 2 n(h; a (2) 
{h} \ ij p=i / {h} 

The parameter /3 2 (in the following rescaled into (3) rules the level of noise in the network, while the ratio 
a = limjj^oo B/H accounts for the difference in clonal sizes between helpers and B-cclls, in the limit of 
large sizes. 

Interestingly, the complex interactions between helpers and B-cells are absorbed, via marginalization, 
within a two-body Hamiltonian H(/i;^), which turns out to be equivalent to the Hamiltonian of the 
Hopfield model which functions as an associative memory [5j. Here, neurons are replaced by helper 
lymphocytes and the memorized "patterns of information" correspond to particular strategies, encoded 
by cytokine secretions, for the B-cells. In the following we will refer to the generic element i as lymphocyte, 
spin or neuron, indifferently. 

As well known from neural network theory [5], this system may show cooperative cognitive features 
only if a < a c , where a c < 1 is a critical value implicitly offering the first global constraint for a 
correct performance of the immune system: Helpers must be more than B-cells; this is indeed confirmed 
experimentally pQ. 

Let us now see how the system responds to an external stimulus in the adiabatic limit. For simplicity 
we consider the case where all clones are made to expand; this peculiar situation actually corresponds to 
a real disease, known as Autoimmune Lymphoproliferative Syndrome (see for instance |47j). Therefore, 
in the partition function Zh,b{<^, P), we substitute the centered Gaussian weight exp(— 53u^/2) with 
ex P[ — — ^o) 2 /2], fro being the new reference value of clonal expansion. It is immediate to check 

[4] that when the bipartite spin-glass system is mapped into the neural network counterpart, a new term 
appears in the resulting (pseudo)-Hamiltonian T-L{h;^): 



m{h- o = A EE + Vtfto E ( 3 ) 



H B H 



2H 

with 7/ G A/"[0, 1]. Thus, a clonal expansion, which can be thought of as ordered work, implies a produc- 
tion of "heat", thought of as a random field (a noise source), which affects helpers performance]^] As a 
consequence, too strong responses (lymphocytosis) can be coupled with (transitive) autoimmune mani- 
festations because the random fields prevail over retrieval; this offers a simple theoretical explanation to 
a common fact. 

Therefore, the tunable parameters of the system are /3, a and 6o; according to their mutual values 
the system described by H(h;(,) exhibits different behaviors (see Fig. 1). In the neural scenario, one 
can outline a region in the parameter space where the system is able to retrieve a given learnt bit-string 



4 Note that this Hamiltonian shares the same mathematical structure of the cost functions of the Boltzmann learning 
machine 1 141 . whose thermodynamical equivalence with neural networks has been deepened in |10| . 

5 The interaction between a B cell and a T cell is rather complex as it requires first that B-cells present, through their 
major histocompatibility complex, a peptide processed from the antigen previously detected, and this that must bind 
enough to the corresponding TCR of the helper [J. Then, T-cells secretes cytokines directed to B cells themselves. While 
biologists distinguish between "helper" T-cells (secreting stimulatory cytokines) and "regulators" or "suppressors" to T-cells 
(secreting inhibitory cytokines), here we simply refer to helper cells. Indeed, the experimental research on details of these 
interactions is still ongoing. Along the same line, we adopt symmetric distributions for cytokines with a spirit close to the 
neural network counterpart . The framework depicted is therefore meant to describe the behavior of the system after the 
antigen presentation by B cells, but prior to the possible, consequent clonal expansion. 

6 Note that the noise level j3 appears linearly in the "bulk" term, while under the square root in the noise term, mapping 
standard propagation of information in the former and random diffusion in the latter. 



3 




Spin-Glass 
Chronic Infection 

Figure 1: Schematic representation of the working (retrieval) region. The three normal ways to escape 
have been depicted: Random Field (too large or too low mean activity), Spin-Glass (too large relative 
number of patterns) and Paramagnetic (too large degree of noise). Such states correspond to unhealthy 
situations, namely lymphocytosis, chronic infections and senescence, respectively. The working region is 
restricted, by definition, to the quadrant a > 0, j3 > 0. 



of information, which corresponds to a particular pattern £ M . How does this concept translate, in the 
immunological context? 

This can be understood by introducing the B Mattis magnetizations — jj £?hi, one for each 
B-clone. If, for example, the pattern of cytokine activation /i = 1 has been retrieved, then uii = 1, which 
means that all helper states are parallel to the corresponding cytokines linking them to the first B-clone. 
As a consequence, all the inhibitor signals are absent while all the eliciting signals are present, namely, the 
assembly of helpers orchestrates the response against the antigen coupled to the first B-clone conferring 
to the latter the maximal strength for the clonal expansion. We stress that the Hamiltonian (3) shows 
gauge invariance (breakable with an external field [5] , usually thought of as a threshold for firing in neural 
counterpart), which is lost in the bi-layered B-H network. In other words, the marginalizations performed 
in Eqs. (2) and (11) break the symmetry hi — > —hi. In fact, the H-H network is left invariant under 
the transformation hi — > —hi for all i G (1, ...,H), while in the B-H network, having fixed the cytokine 
pattern, the same transformation leads to a change in the field felt by the connected B clones. 

We finally underline that, within a pure fully-connected approach, each helper clone interacts with 
the whole B-repertoire, and the assembly for deciding about one single B-clone includes the whole helper 
ensemble, which are both unrealistic fatures given the huge sizes of such populations and the fact that 
interactions are essentially local and of diffusive nature. In what follows we remove the hypothesis of a 
fully-connected by-layered spin-glass network and we allow only a (small) fraction of the whole ensemble 
of helpers to coordinate the response of a given B-clone. As we will show, the resulting associative 
network made by helpers is able to parallel process, namely to manage several B-clones together at the 
same time, which is an emergent property much closer to the biological reality. 

3 Getting closer to biology: Dilution in the B-H interactions 

3.1 The refined model and its robustness 

The purposes of this section is to generalize the original model in two ways. 

First, we show that, even including intra-clonal cooperativity among B-cells (hence for dichotomic 
variables bi = ±1) we can still retain, under a suitable expansion, a resulting helper network working as 
an associative working memory. This extension allows to incorporate even second order effects of clonal 
interactions in the B-repertoire previously neglected. For instance, a virus, e.g. represented by the string 
of information 0111, yields the clonal expansion of the lineage of B-cells producing the antibody 1000, 
and this, in turn, will drive the clonal expansion of the lineage of the B-cells producing the anti-antibody 
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0111; the latter, resembling (for receptor binding) the original virus, could elicit further the expansion 
of the clone 1000, resulting as a second-order cooperative feedback by the cells within the original clone. 
Therefore, the Gaussian variables for B-cells are replaced by dichotomic ones and we assume here that 
both H lymphocytes hi and B lymphocytes b^ take discrete values ±1, where —1 stands for quiescence 
while +1 for a firing state (secretion of cytokines -for the H- or secretion of antibodies -for the B-); in 
particular, B clones are subjected to the following probability distribution: 



P(b IM ) = P5 (b ^ 1) + (l-P)6 (bft+1) , 



(4) 



where 6 X is the Kronocker delta function and P £ [0, 1], such that for values of P 7^ 1/2, it acts as a bias 
mimicking the original bo parameter. 

Then, we show how, by properly diluting the bipartite spin-glass by-layer of B and H cells, the 
associative helper network becomes able to handle parallel recall of several different patterns (strategies) . 
We introduce the Hamiltonian of this new by-layer in complete analogy with the previous section as 



B H 



H(M;0 = -^EE^A- 



(5) 



fi — 1 i—l 



When dilution is absent, free energy minimization shows that this Hamiltonian, for orthogonal patterns 
£M • £ w = H ■ S(fi — v) (which results from uncorrelated distributions in the thermodynamic limit H — > 00 ) , 
tends to expand the fi th B-clone (i.e. > 0), but it provides the other clones with no net information (i.e. 
m u ^^ = 0). Conversely, real immune systems are able to address a wide variety of antigens simultaneously 
managing several clones at the same time and, in this sense, we refer to parallel processing capabilities of 
the network. This property can be restated as the ability to have equilibrium states with several Mattis 
magnetizations different from zero, or above the noise level at finite volume, without being spurious states 
0. 



We introduce dilution in the couplings, by writing: 

Si/Li ~ ^-if-t 



(6) 



where assumes values ±1, representing the excitatory or inhibitory quality of the link (cytokine), 
and Ci^ assumes values 1 or representing, respectively, existence (1) or absence (0) of the link. Their 



probability distribution are: 



PM = 
PM = 



(7) 
(8) 



where d can range continuously in [0, 1], allowing some intensive tuning Hence, we get the following 
distribution for ^ M : 



P(0 = P(ci^) 



(1-d) 



'(«f-i) 



(1-d) 



dS^' , 



(9) 



such that for d — s- 1 no network exists, while for d — > the Hopfield model is recovered. 
We can now proceed to the calculation of the partition function Zh,b(c(, (3, d): 



2" 2 



^^exp[-/3ft(M;0]- 

{h} {b} 



Z H , B W,d) 

Summing the partition function over the b^ distribution (Eq. (|4| 

2 h b r 





H 



exp 



H 

E 



(10) 



(11) 



7 We stress that the assumption of symmetry for cytokine distribution (see Eq. |8|) can be easily relaxed leading to a 
network with low level of activation and whose properties strongly resemble the standard ones 6 . 
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Expanding, in the last line in Eq.( 11 ), the logarithm of the hyperbolic sine at the second order we get 



z H>B (p, d) = n cx p ( 2P - E h ^ + 1 1 - ( 2P - !) 2 ]^ E \ ■ (") 

{/i} M=l [ V i=l t,j 

We can split the contribution of the field ipi insisting on the generic hi lymphocyte in two terms: (fi = 

ipf xt + (p int : 

v r = (2P- i)-4Etf. ^ s [i-(2i , -i) J ]^EE^ 

The latter results from the effective two-body interactions among helpers and preserves the Hebbian 
structure necessary for the retrieval. Conversely, the former, for sufficiently large B, by the central limit 
theorem (CLT), approaches a Gaussian variable and can be thought of as a term of noise. Hence, posing 
r\ = Af[0, 1], we can write 

B 

Etf = (13) 

Shifting (3 2 to fi to rescale noise we get 

f3H(h; e, 7?) = -[1 - (2P - l) 2 ]^: E E W " ( 2P " ^ ^ g~ ^ E ( 14 ) 

j,j i=l 

where the first and the second terms mirror, respectively, the associative network and the perturbing 
random field (due to poly-clonal activation) of Eq. Assuming for simplicity P = 1/2 so to start 

studying the system in the absence of external stimuli, we get the effective Hamiltonian 

if H B 

whose properties will be investigated in the rest of the paper. 
3.2 Notes about the coupling distribution 

As it is immediate to check, each missing link between the i th helper cell and the fi th B-cell in the 
bipartite B-H network appears as a (i.e. £f = 0) in the i th entry of the bit-string £ M in the equivalent 
associative network, which ultimately affects the interaction matrix J = Ju. Of course, the larger the 
degree of dilution, the stronger the difference between such (random) coupling matrix and its Hopfield 
counterpart. This section is devoted to the investigation of the properties of the matrix J. 

Let us consider a set of H nodes labeled as i — 1, ...,H and let us associate to each node a string of 
length B and built from the alphabet { — 1,0,1}, meaning that the generic element £f, with i g 
and fj, £ [1, -B], can equal either ±1 or 0. For the H-H network described by the Hamiltonian in Eq. ( |15| ), 
the interaction strength between two arbitrary nodes i and j is given by 

J(i = M. (16) 



Of course Jij £ [-B, B]. Equation ( 16 ) gives rise to a network of mutually and symmetrically interacting 
nodes, where a link between nodes i and j is drawn whenever they do interact directly (J^- ^ 0), either 
imitatively (Jy > 0) or anti-imitatively (Jy < 0). 

First, one can calculate the probability that two nodes (since they are arbitrary we will drop the 
indexes) in the H-H network are linked together, namely 

B 

P&M B) = P(J ^0;d,B) = l-P(J = 0;d,B)=l-J2 Pmm-o(fc; d, B), (17) 

k=0 

where P snm -o(k; d, B) is the probability that two strings display (an even number) k of non-null matchings 
summing up to zero; otherwise stated, there exist exactly k values of fi such that £f fi 1 ^ and they are 
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Figure 2: (Color on line) Contour plot for P(J = 0; d, B) representing the probability that the coupling 
between two arbitrary nodes is zero, as a function of B (linear scale) and d (logarithmic scale) . 



half positive and half negative. In particular, P SU m-o(0; d, B) — [d(2 — d)] B , because this is the probability 
that, for any /i £ [1, B], at least one entry (either or £^ or both) is equal to zero. More generally, 



P sum -o(k; d, B) = 



l-d 



2/,' 



[d(2 - d)} 



B-k 



\k/2 



(18) 



where the first and the second factors in the r.h.s. require that k entries are non-zero and the remaining 
B — k entries are zero; the third factor accounts for permutation between zero and non-zero entries, while 
the last term is the number of configurations leading to a null sum for non-null entries. Therefore, we 
have 



B r 



P(J = 0;d,P) = [d(2-d)] B ]T 



fe=0 



2d(2 - d) 



B\fk 
k J \k/2j' 



(19) 



whose contour is shown in Fig. [2j As for its asymptotic behavior, we distinguish the following cases (for 
simplicity we assume B finite and even): 



P{J = 0;d,B) 
P{J = 0;d,B) 



1 - 5(1 - df + -B{B - 1)(1 - df + 0(1 - df 



(-1) B/2 A 



T(l/2 - B)Y(l + B/2\ 



(l-2Pd)+0(d 2 



1 - 2Bd ( B 



4 b/2 y B / 2 



(20) 

0(d 2 ). (21) 



The average number of nearest neighbors per node {z)d,B,H follows immediately as (z)d,B,H = HP\ink{d, B). 

More generally, we can derive the coupling distribution P(J; d, B), once having defined P+i(k), P_x(fe) 
and Po(fe), as the probability that, given two strings, they display k matches each equal to +1, —1 and 
0, respectively, namely 



P+i(k;d) = P_ 1 (fc;d) 



(l-d) 2 



, P (fc;d) = [d(2-d)] A 



(22) 



Hence, we can write 



(B-J)/2 



B\ 



P(J;d,P) = J2 P +^ 1 + J ; d ) P -^ d ^ B - 2l - J > d) l\(l + J)\{B- 2l-J)\ (23) 
~ Af(0,aj(d,B)). 

The last asymptotic holds for large P; the null mean value (J)d,B — is due to the symmetry character- 
izing P(£f) (see Eq. 9), while the standard deviation is oj = a/ {J 2 )d,B = VB(1 — d). 

The full solution of the previous equation reads as P( J; d, B) = x J (f) [d(2-d)] B 2 Fi[-(B-J)/2,-(B- 
J — l)/2, 1 + J, Ax 2 ], where x = (1 — d) 2 /[2d(2 — d)]. An explicit, exact expression for this probability 
can be written for a particular value of d, by exploiting Gauss's Hypergeometric Theorem [13] . so that 
when Ax 2 — 1, corresponding to d = 1 — ^/2/2 ps 0.293, we have 



P(J;1 



2/2, B) 



2B 

B + J 



-J 2 /B 

PkB ' 



(24) 
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In the last passage we used the Stirling approximation assuming B±J large, namely that the distribution 
is peaked on non-extreme values of J. 

It is worth underlining that P(J; d, B) does not depend on the size H. Indeed, patterns are drawn 
independently and randomly so that the coupling Jij may be regarded as the distance covered by a 
random walk of length B and endowed with a waiting probability d(2 — d). Hence, the end-to-end 
distance is distributed normally around zero and with variance (mean squared distance) which is given 
by the diffusion law, namely ~ B. The possibility of the walker to stop simply reduces the effective walk 
length to [(1 — d)(2 — d)]B = (1 — d) 2 B in agreement with results above. 



3.3 Pattern dilution versus Topological dilution 

Dilution on pattern entries does not necessarily yield to a topological dilution for the associative network, 
but, as we will see, can induce non-trivial cooperative effects. On the other hand, a topological dilution 
can be realized by directly cutting the edges on a standard Hopfield network. In this section we highlight 
the deep difference between these two kinds of dilution. 

First, we recall that, according to a mean-field approach, the network is expected to display a giant 
component when the average link probability is larger than l/H. In the thermodynamic limit and 
assuming a large enough size B (stemming from either low, i.e. B ~ log H, or high, i.e. B ~ H, storage 



regimes) to ensure the result in Eq. (23) to hold, for any finite value of d the emergent graph turns out 



to be always over-percolated. In fact, Pu n k(d, B) = 1 — P(J — 0; d, B) ~ 1 — 1/-J2tWj, so that it suffices 
that (Tj > H/[V2tt(H - I)] -> l/V2iT and this leads to d < 1 - (27TB)" 1 / 2 -> I. 

On the other hand, when B is finite we can check the possible disconnection of the network by 



studying P(J = 0; d, B) from Eq.|20]and we get that fl ink (d, B) < l/H for d > I -1/VBH. Thus, in the 
thermodynamic limit, for any finite d, the graph is still overpercolated. Replacing l/H with (log H)/H, 
one also finds that the graph is even always connected. 

Different scenarios may emerge if we take d properly approaching to I as H is increased [2] . 

Another kind of dilution can be realized by directly cutting edges in the resulting associative network, 
as for instance early investigated in the neural scenario by Sompolinsky on the Erdos-Renyi graph [151 [5] 
or more recently by Coolen and coworkers on small worlds and scale- free structures [321 SI] • 

Such different ways of performing dilution - either on links of the associative network (see 48, 5 , 52"] FH]) 
or on pattern entries (see Eq. |4| - yield deeply different thermodynamic behaviors. To see this, let us 
consider the field insisting on each spin, namely for the generic i th spin ipi = $^y=i Jij&j, and analyze 
its distribution P(ip\d) at zero noise level. When dilution is realized on links (d is the fraction of links 
cut), only an average fraction d of the H available spins participates to <p, in such a way that both the 
peak and the span of the distribution decrease with d (Fig. 3, left). Conversely, when dilution is realized 
on the underlying bi-layer (d is the fraction of null entries in a pattern), as d > 0, P(cp\d) gets broader 
and peaked at smaller values of fields. The latter effect is due to the fact that couplings are, on average, 
of smaller magnitude. As for the former effect, we notice that, at f3, H and B fixed, when dilution is 
introduced in bit-strings, couplings are made uniformly weaker (this effect is analogous to a rise in the 
fast noise) so that the distribution of spin configurations, and consequently also P(ip\d), gets broader. 
At small values of dilution this effect dominates, while at larger values the overall reduction of coupling 
strengths prevails and fields get not only smaller but also more peaked (Fig. 3, right). 



4 Parallel processing performances 

4.1 Statistical mechanics of the low-storage case 

As a minimal bibliography in the statistical mechanics approach, we report that a different study sharing 
some similarities with our work, investigates an associative network with pattern inhibition (due to 
chemical modulation) in neuroscience scenario and has been performed in |18]|19j. while a macroscopic 
behavior close to parallel processing were already reported in |20j . where more than one magnetization 
were able to retain strictly positive values owing to strong pattern correlations (a completely different 
motivation) . 

Now, we solve the model in the regime B ~ log if, such that the limit a = lim^f^oo B/H = 0. Like 
in the Amit-Gutfreund-Sompolinsky (AGS) neural network [5], the comprehension of the non-saturated 
case is the first fundamental step to face before moving to the saturated case. This can be accomplished 
in several ways: We use the approach performed in |17j . 
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Figure 3: Left panel: Distribution of the field ip acting on the helpers with (Sompolinsky) dilution in the 
directed network [H-H] . Right panel: Distribution of the field ip acting on the helpers with (our) dilution 
in the bi-layered network [B-H] . 



Considering the pattern overlaps, also called generalized Mattis magnetization, 

H 

U 



4e^ ( 25 ) 



it is possible to rewrite the Hamiltonian ( 15 ) as 

«too = -§ J2 m l^ + l B > ( 26 ) 

where we emphasized that the Mattis magnetizations are functions of the helpers. 

These order parameters (properly averaged) are extremely useful to quantify the phases of the system as 
they are zero when the system displays no collective capabilities in orchestrating a response and differ 
from zero otherwise. 

We introduce three types of average: the Boltzmann average uj(m^) = J2h m v ex P( — ffti(h\ £))/Zh,b(P, d), 
the average E performed over the quenched disordered couplings £ and the global expectation Ew(m M ) 
defined by the brackets {m^). 

The equilibrium equations for the order parameter can be obtained from the quenched free energy 
(F(P,d)) 

(F(j3,d)) = - Km ^ElogZ H , B (0,d) = - lim ^Elog]T e ~^ h ^. (27) 

{h} 

Introducing the notation m = (mi, ...,ms) and £j = the above equation can be expressed in 
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terms of the density of state P(m) 



2 a 

D(m) = ^<5(m- m(ft)), (28) 
{h} 



as 



Z H . B (f3,d) = J rfmZ(m), Z(m) = e H0m2/2 V{m). 



Note further that the delta function here is a product of independent delta functions, once for each 
B-clone, namely: 

B 

5 (m - m (ft)) = JJ 5 (m p - m M (ft)) . 

We need now to introduce B integration variables x = [x\, ...,xb) to switch the delta functions to their 
integral representation as 

2?( m ) = (|Lj | ^iffx-m^g-iEfE?^*, = ^gj y dxe H(i X .m+(l n 2cos(x.e)) e)) 

where we assumed the property Yitcih^oo J2f = (/(£))?■ 

Physically speaking, the log-density of the states quantifies the constrained entropy S'(m) and can now 
be evaluated trough saddle point integration because of the factor H in the exponent of its integral 
representation above. Strictly speaking, we calculate only the leading term of the density of states, 
which however is the only retaining statistical meaning in the thermodynamic limit and is given by the 
maximum over x of S*(x, m), the latter being 

S(x, m) = ix • m + (ln2cos(x • £))$. 

It is then clear that the intensive quenched free energy can be rewritten as 

lim (F(B,d)/H) = ~]og2- lim / dmV(m)e^ Hm2 . (29) 

The main contribution to free energy can be made explicit as a finite-dimensional integral; as outlined be- 
fore for the constrained entropy, trough the extensively linearity property of thermodynamic observables, 
for large value of H the integral will be dominated by the saddle-point that maximizes the exponent as 

Km (F(p,d)/H) = - lim ^— / dmdxe" ffW(x ' m) = extr[/(x, m)l, (30) 

H^-oo H^oo Hp J 

/(x,m) = -^m 2 -ix-m- (log 2 cos[/3£ • x]) ? . (31) 

To identify the various ergodic components (which are expected to be B + l, one being the paramagnetic) 
we find the stationary points of /(m) trough the system 9 mjj /(m) = for all fi E (1, B) 1 which gives 
the vectorial self-consistence equations 

x = i/3m, im = (£ tan[£ • x])^. (32) 

Being the saddle point values of x purely imaginary, and using tanh(.x) = —i tan(zx) we get 

m= (£tanh[/?£-m])£. (33) 

Then, the above equation has to be averaged over the pattern distribution P(£f) and finally solved 
numerically, as explained for the following few example^] 

Before proceeding it is worth noticing that the Hamiltonian %(ft;£) of Eq. (26) is quadratic in the 
Mattis magnetizations and the B stored patterns contain (on average) a fraction d of null entries. As a 
consequence, the pure state ansatz (mi = l,m 2 = ... = TOb = 0) [5] can no longer work. In fact, now, 
the retrieval of a pattern (say £ 1 , the one coupled to mi) does not employ all the available helpers (and 
coherently mi is strictly smaller than one for d ^ 0) and those corresponding to null entries can be used 



8 Dcspite the structure of the self-consistencies for general B retrieved patterns are extremely simple both conceptually 
and analytically, they become, already for B > 3, of prohibitive length and handleable only via calculators. 
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Figure 4: Behavior of the two Mattis magnetizations mi and m 2 versus d at two (small) noise levels, 
namely = 10~ 4 (left panel) and = 0.05 (right panel). 



to recall other patterns. As the energy is a quadratic form in the Mattis magnetizations, its minimization 
requires that further patterns (up to the exhaustion of all available helpers) are recalled in a hierarchical 
fashion. More precisely, the (thermodynamical and quenched) average of the Mattis magnetization of 
the k th pattern retrieved scales as d k (l — d), that is the overlap with the free helpers still available is 
maximized. The overall number of retrieved patterns K therefore corresponds to X)fc=o ^0- ~ d) k = 1, 
with the cut-off at finite H by d(l — d) K > H^ 1 due to discreteness. For any fixed and finite d, this 
implies K < log H, which can be thought of as a "parallel low-storage" regime. In other words, once 
mi has been retrieved, it is energetically convenient for the system to coordinate its free helpers to align 
with another pattern (the one denoted by £ 2 ) instead of letting them align randomly. 



4.2 The case B 



The self-consistencies encoded into Eq. ( 33 ) for the simplest case B = 2 are 



mi(P,d) = d(l - d)tanh(/3mi) + — ^ [tanh[/3(mi +m 2 )] + tanh[/3(mi - m 2 )]] , (34) 

m 2 (/3, d) = d(l - d)tanh(/3m 2 ) + — ^ [tanh[/3(mi +m 2 )] - tanh[/3(mi -m 2 )]]. (35) 

The solution of these equations for different values of /3 is reported in Fig. (4). In the low (fast) noise 
limit (j3 — > oo ), when no dilution is present (d = 0) the second magnetization m 2 disappears and the 
first magnetization mi approaches the value 1 as expected because the Hopficld model is recovered. As 
dilution is increased, mi decreases linearly, while m 2 displays a parabolic profile with peak at d — 0.5. In 
the presence of (fast) noise, m 2 starts growing for higher values of dilution because (as will be cleared by 
the signal-to-noise analysis of the next section) the signaQ insisting on the latter, which is proportional to 
d(l — d), must be higher than the noise level in order to be effective. Also notice that, from intermediate 
dilution onwards, m x and m 2 collapse and the related curves converge at a "bifurcation" point. 
Let us now deepen these results, first from a more intuitive point of view, and later from a more rigorous 
one. 

In the zero (fast) noise limit, let us fix t; 1 as the pattern corresponding to the maximum overlap with 
the magnetic configuration, so that the expected Mattis magnetization is (mi) = (1 — d). The remaining 
H d "free" spins will seek for patterns to align with, namely displaying non-null entries in correspondence 
with the null entries of £ 1 . Actually, due to dilution, one expects that the second best-matching pattern 
only engages H d(l — d) spins, while the remaining H d 2 will match other patterns; in general, the k-th 
best-matching pattern is expected to engage H d k ^ 1 (l — d). 

Such a hierarchical fashion for alignment is more optimal than a uniform alignment of spins amongst 
the available patterns which would yield m^ = d/B for any k and an overall energy —H/2Y^ k {d/B) 2 = 
— (d 2 H)/(2B). Indeed, the hierarchical solution is the one that minimizes the energy (recall that the 
magnetization are summed quadratically) as well as the most likely from a combinatorics point of view, 
providing an overall energy -H/2J2 k [(l ~ d)d k f = -H(l - d 2+2B ){l - d)/[2(l + d)}. 

Therefore, the system is able to perform the "parallel retrieval" of K patterns, whose magnetizations 
are m (1 = {l/H)J2i^ihi, that is (mi) = (1 — d), (m 2 ) = d(l — d), (ttik) = d K (1 — d). It is easy 



3 We use the term "fields" for the forces acting on hi and "channels" 
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to see that it must be d K+1 — 0. Hence, for any finite value of d, an infinite number of patterns can in 
principle be retrieved, i.e. d K — > 0, for K — > oo. More accurately, taking into account the discreteness 
of the system, we have that the last pattern to be retrieved will match only one helper, which yields 
H d K (l — d) = 1, from which K = [\ogH + log(l — d)]/log(l/d) ~ \ogH. In the low storage regime, 
with B finite or scaling logarithmically with H, the retrieval of all patterns can, in principle, always be 
accomplished. 

When noise is also introduced, we have that for the i-th pattern to be retrieved the field felt by helpers 
has to be larger than the noise level, that is [d(l — d) 1 ] > if this condition is not fulfilled the field is 
confused with the noise and the pattern can not be retrieved. 

In the case of large degree of dilution, i.e. d close to 1, patterns are so sparse that not all the H 
helpers can be matched; assuming that patterns get orthogonal, only a fraction B(l — d)/H (= a(l — d) 
or = alogH(l — d)/H in low and high storage regime, respectively) of helpers is aligned with a given 
pattern, the remaining are free and their mean value is zero. In this condition the emergent graph is also 
disconnected. 

Beyond constraints on d, probably the most striking feature displayed by mi,m2 is the bifurcation 
occurring at intermediate values of dilution (see Fig. 4). In order to understand this phenomenon we 
can divide spins into four sets: Si, which contains spins i corresponding to zero entries in both patterns 
(£? = = 0), therefore behaving paramagnetically; S2, which includes spins seeing only one pattern 

I 7^ l£i I); ^3 > which contains spins corresponding to two parallel, non-null entries (£_} — £ 2 7^ 0), 
thus being the most stable; S4, which includes spins i corresponding to two parallel, non-null entries 
(£* = — £| 7^ 0), hence intrinsically frustrated. 

The cardinality of these sets are: |Si| = d 2 , |S 2 | = 2d(l - d), |S 3 | = (1 - d) 2 /2, and |S 4 | = (1 - d) 2 /2. 
Now, the most prone spin to align with the related patterns are those in S3 and in S2, and this requires 
(1 — d) < /3 _1 for the field to get effective. As d is further reduced, mi and m 2 grow paired, due to 
the symmetry of the sets S 2 and S3. The growth proceeds paired until the magnetizations get the value 
ra\ — m% = (1 — d) 2 /2 + d(l — d), where the two contributes come from spins aligned with both patterns 
and with the unique pattern they see, respectively. From this dilution onwards frustrated spins also start 
to align so that one magnetization necessarily prevails over the other. This explanation can be extended 
to any finite B and, in general, the number of sets turns out to be P + 1 + J2k=a Lht^J • 
Now we want to quantify these bifurcation points, and to this task let us call 



mi> 



m 2 



(36) 



We use Eqs. (34) and (35 1 and expand for small values of x 



(mi) - (m 2 ) = x = d(l - d)[tanh(/3(mi)) - tanh(/?(m 2 ))] + (1 - d) 2 tanh (^(m x ) - (m 2 )) , (37) 



where 

d(l-d) [tanh CS(mi)) - tanh (/3(m 2 ))] - d{l-d) 
and 



tanh(/3(mi)) — tanh(/3(m 2 )) 



fix 



cosh (/3(mi)) 



(1 - d) 2 tanh( ( 5(mi) - (m 2 )) - (1 - d) 2 /3x + 0{x 3 ). 



Thus, the leading term is 



d{l-d)P 
cosh 2 (/? (mi)) 



/3(l-rf) ; 



The critical value of j3 corresponding to the bifurcation point is defined as 



Qbif _ 



1 



(l-d) S 



1 



(1-d) 



d coBh 2 (tf'mi) 



(38) 

(39) 
(40) 

(41) 



This mechanism can be easily generalized to the case of multiple patterns. 

We move now to analyze the critical noise level at which the magnetizations disappear and the network 



dynamics becomes ergodic, still in this test-case of two patterns: Expanding expressions (35) we find 

(1-d) 2 

-[/j\iiti/ -r mv'"2/ -r — 

/? 3 



\m 2 ) 



d{l-d)[P(m 2 )} 



+ d(l-d)?-(m 2 ) 



2 

(l-d) 2 
2 



/3(mi 



^(%> + y(K) 3 



{m 2 ) 



3(m 1 ) 2 (m 2 )+3(m 1 )(m 2 ) 2 )] 



((mi) 3 



(m 2 y - 3(mi) 2 (m 2 ) + 3(mi)(m 2 ) 2 ; 
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such that we can write 

(m 2 )~(l-d)/3(m 2 )+0((m 2 ) 3 ). (42) 
Therefore the critical noise level turns out to be 

A = (43) 

This calculation can easily be generalized to several patterns, too. 

4.3 The case B = 3 

When three patterns are considered, the related self-consistent equations that constraint the system to 
parallel processing are the following (we skip the brackets (.) for the sake of clearness): 

ml = d 2 (l - d) tanh[/3ml] - (l/4)d(l - df tanh[/3(-ml - m2)] + (44) 

+ (l/4)d(l - d) 2 tanh[/3(ml - m2)] - (l/4)d(l - d) 2 tanh[(-ml + m2)] + 

+ (l/4)d(l - d) 2 tanh[/3(ml + m2)] - (l/4)d(l - df tanh[/3(-ml - m3)] + 

- (l/4)d(l - d) 2 tanh[/3(ml - m3)] - (1/8) (1 - d) 3 tanh[/3(-ml - m2 - m3)] + 

+ (1/8)(1 - d) 3 tanh[/?(ml - m2 - m3)] - (1/8)(1 - df tanh[/3(-ml + ml - m3)] + 

+ (1/8)(1 - d) 3 tanh[/3(ml + m2 - m3)] - (l/4)d(l - d) 2 tanh[/3(-ml + m3)] + 

+ (l/4)d(l - d) 2 tanh[/3(ml + m3)] - (1/8) (1 - df tanh[/?(-ml - m2 + m3)] + 

+ (1/8)(1 - d) 3 tanh[/3(ml - m2 + m3)] - (1/8)(1 - d) 3 tanh[/3(-ml + m2 + m3)] + 

+ (1/8) (1 - d) 3 tanh[/?(ml + m2 + m3)]] 



ml = -(l/4)d(l-d) 2 tanh[/3(-ml-m2)]-(l/4)d(l-d) 2 tanh[^(ml-m2)] + (45) 

+ d 2 (l - d) tanh[^m2] + (l/4)d(l - df tanh[/3(-ml + m2)] + 

+ (l/4)d(l -d) 2 tanh[/3(ml + m2)] - (l/4)d(l - df tanh[/3(-m2 - m3)] - 

- (1/8)(1 - d) 3 tanh[/3(-ml - m2 - m3)] - (1/8)(1 - d) 3 tanh[/3(ml - m2 - m3)] + 
+ (l/4)d(l - df tanh[/3(m2 - m3)] + (1/8) (1 - df tanh[/?(-ml + m2 - m3)] + 

+ (1/8)(1 - d) 3 tanh[/?(ml + m2 - m3)] - (l/4)d(l - df tanh[/3(-m2 + m3)] - 

- (l/8)(l-d) 3 tanh[/3(-ml-m2 + m3)] - (1/8)(1 - d) 3 tanh[/3(ml - m2 + m3)] + 
+ (l/4)d(l - df tanh[/3(m2 + m3)] + (1/8) (1 - df tanh[£(-ml + m2 + mS)\ + 

+ (1/8) (1 - d) 3 tanh[/?(ml + m2 + m3)]] 
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Figure 6: Parallel retrieval of three strategies. Behavior of three Mattis magnetization versus d in 
the slow (fast) noise limit (i.e. /3 _1 = 10~ 4 ). Continuous lines correspond to numerical solution of 
Eqs. (44|-(46l, while dashed lines correspond to Monte Carlo simulations. 



m3 = -(l/4)d(l - d) 2 tanh[/3(-ml - m3)] - (l/4)d(l - d) 2 tanh[/3(ml - m3)] - (46) 

- (l/4)d(l - d) 2 tanh[/3(-m2 - m3)] - (1/8)(1 - d) 3 tanh[/3(-ml - m2 - m3)] - 

- (1/8) (1 - df tanh[/3(ml - m2 - m3)] - (l/4)d(l - d) 2 tanh[/3(m2 - m3)] - 

- (l/8)(l-d) 3 tanh[/3(-ml + m2-m3)] - (1/8)(1 - d) 3 tanh[/3(ml + m2 - m3)] + 
+ d 2 (l - d) tanh[/?m3] + (l/4)d(l - d f tanh[/3(-ml + m3)] + 

+ (l/4)d(l - d) 2 tanh[/?(ml + m3)] + (l/4)d(l - df tanh[/3(-m2 + m3)] + 

+ (1/8)(1 - d) 3 tanh[/3(-ml - m2 + m3)] + (1/8)(1 - d) 3 tanh[/3(ml - m2 + m3)] + 

+ (l/4)d(l - d) 2 tanh[/3(m2 + m3)] + (1/8) (1 - d) 3 tanh[£(-ml + m2 + m3)] + 

+ (1/8) (1 - d) 3 tanh[/3(ml +m2 + m3)]]. 

Recalling the picture explained in the previous subsection, the magnetizations mi, m2 and m.3 again 
grow together until all helpers corresponding to equal non-null entries and to single non-null entries are 
aligned. Then helpers which are aligned only with two patterns out of three start to feel the field and 
get aligned hence breaking the symmetry. At this point, say mi and m.2, still grow while m 3 decreases. 
The next symmetry-breaking occurs when all helpers corresponding to equal non-null entries t; 1 = £ 2 get 
aligned. From this point onward one magnetization prevails against the other. The same process applies, 
mutatis mutandis, for larger number of patterns (see Fig. 5). 

The last subtlety to be investigated is given by the small discontinuities in the behavior of the magneti- 
zations (see for instance Fig. 6). To explain this feature, let us consider the set of patterns £1, £21 £b an d 
assume the zero fast noise limit (f3 — > 00) for the sake of simplicity, so that we can take \m k \ = (1 — d)d k ~ 1 , 
for k = 1, B as (absolute) Mattis magnetizations. The field insisting on the arbitrary helper hi can be 
written as 

w = ^E - E ™ M - E « E ( 47 ) 

jjti n=l |U=1 M=l 

where in the last passage we dropped the second sum as it is vanishing in the thermodynamic limit. Now, 
let us consider the spin hi, which, again without loss of generality can be thought of as aligned with the 
first pattern and equal to +1. The field insisting on this lymphocyte is ipi = (1 — d)[l + J^ =2 e(l, /i)d M_1 ], 
where e(l,[i) — sgn(^ 1 l , m M ). We notice that, in general, ipi is not positive definite so that the occurrence 
of the condition ipi < would lead to the spin flip hi = 1 — > hi = — 1 and, consequently, to mi < (1 — d). 
In order to understand this effect we focus on e(l,/it). By assumption, mi = (1 — d) and hi = so 
that the first entry of pattern fi = 1 effectively contributes to the related magnetization mi. As for the 
following magnetizations m p> 2, effective contributes can arise only from entries £,j >2 corresponding to 
null entries in £j. Otherwise stated, there is no correlation between and m^ for /1 > 1 (in fact, e(l, /i) is 
zero on average), and one can count the pattern configurations leading to tpi < applying combinatorics. 

Seeking for clarity, we consider the following explicit cases: 
- The probability that the first entries of all patterns /1 > 1 are misaligned with respect to the related 
magnetizations is [(1 - d)/2] B " 1 , hence giving a field ipi = (1 - d)[l - V >2 d ^ X ] = l-2d + d B+1 . Such 
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a field turns out to be negative in the interval a\ < d < 1, where a\ — > 1/2 for B — > oo. 

- The probability that the first entries of all patterns /i > 1 but one, say are misaligned and that 

= is d[(l - d)/2] B -' 2 , and this would lead to ipx(l) = (1 - d) - d(l - d B ) + (1 - d)d l ~ 1 , which is 
negative for a% < d < 1, where ei2 — > 1/2 for £? — > 00; of course is growing with I. 

- The probability that the first entries of all patterns fi > 1 but one, say are misaligned and that 
£1 = 1, is d[(l - d)/2] B - 1 and this configuration yields ipi(l) = (1 - d) - d(l - d B ) + 2(1 - d)d'~ x . For 
instance, when I = 2 and B>1, the field is negative for <i > l/y/2; when Z = 3 the field is negative for 
d > 03, where 03 ps 0.648. 

Summarizing, in the zero noise limit j3 — > 00 for any given dilution d, the probability that m\ < (1 — d) 
can be written as a sum over pattern configurations leading to ipi < 0. For instance, for B = 3, only one 
out of the Z B ~ 1 possible configurations, i.e. sgn(^2,m M ) = sgn(^3,m M ) = —1, can yield a spin-flip: the 
corresponding field is ipi = (1 — d)(l — d — d 2 ), which is negative for d > (\/5 — l)/2 « 0.62 (see Fig. [(jj). 
Therefore, for that value of dilution onwards, mi is reduced with respect to the optimal value (1 — a). 
The extent of the loss is a fraction 1/9 of the total, namely ~ 0.34 (see Fig. [6]). 

Notice that while the change reduces mi , other magnetizations are favored by the spin- flip and undergo 
a proportional increment. Also, the occurrence of a magnetization reduction with respect to the optimal 
value is more likely for the highest magnetization mi, because fields insisting on spins contributing to 
mi are the most complex, being the sum of B — 1 terms. The same discussion can be applied in turns 
to m,2'- now the number of terms which sum up to give the field insisting on the (1 — d)d spins which 
contribute effectively to m-2 is i? — 2, so that there are far less configurations able to yield a negative field. 
Consequently, a loss in m 2 is less likely. Therefore, as long as the number of patterns allows readjustments 
in the value of magnetizations with respect to those expected, the arbitrary m^ may display complex 
corrections (possibly occurring at slightly different values of d) due to the combination of several simple 
corrections, each corresponding to the readjustment affecting the previous magnetizations m M< / c (see 
Fig.§. 



4.4 Signal to noise ratio in the zero fast noise limit 

As usually done in the neural network context [S], we couple the statistical mechanics inspection to 
signal-to-noise analysis. Aim of this procedure is trying to confirm the "parallel ansatz" we implicitly 
made by studying the stability of the basins of attractions (whose fixed points are the learned strategies) 
created in the hierarchical fashion we prescribed. We recall that the model we are investigating describes 
a low storage of information in the associative network so that no slow noise is induced by the underlying 
spin glass, i.e. a — 0. Nonetheless, we study the signal to noise ratio in the zero fast noise limit (/? — > 00) 
as a problem formulated in general terms of a, d; then, we take the limit a — > to get estimate about the 
stability of the basins of attractions (where the presence of fast noise can possibly produce fluctuations) . 

Without loss of generality, we assume that the network is retrieving the first pattern. This means 
that spins are aligned with the non-null entries in the first bit-string £ , while the remaining spins explore 
the other patterns. Thus, for the generic spin hi we can write 

B v-\ 

hi = & + y E&U8(&). (48) 
Accordingly, the local field acting on the i th lymphocyte can be written as 

• In the reference case B = 1, like for the pure states of the Hopfield network, we set 

hi = £ + 6(£)ki, (50) 

where hi is a random variable uniformly distributed on the values ±1 added to ensure that there 
are no nulls entries in the state of the network. Hence we find 



(ifihi)^ — (signal + noise}^ = (signal)^ (51) 
being (noises) £ = 0, and so for large H we have 

{signal) ^ = — ^A(l-d) = (1-d), (52) 
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while 



((noises) 2 )^ = ^-^(1 - df = a(l - df 



• In the test case of two patterns retrieved, B = 2, we set: 

hi = tl + 6(t})[& + 5(&)ki]. 
Now, we need to distinguish between the various possible configurations: 
— Vi such that £f ^ 0, £ 2 = and so that h t = £l ^ for large value of H 

(signal)^ = (1 — d), (noises)^ = 0, 



(53) 



(54) 



((noises) 2 ) ^ 



(g -^- 2) (l-^ = a(l-^ 



/d 2 



Vi such that £?• ^ 0, £ 2 ^ and so that hi = ^ 

(signal)^ = 2(1 — d) — (1 — d) 2 , (noises)^ = 0, 

if = 

(signal)^ = (1 — d) 2 , (noises)^ = 0. 

and in both cases 



(55) 
(56) 

(57) 
(58) 



(ff-l)(B-2) 



((nozse S ) 2 ) c = ^ ^ J -(l - d) 3 + y — =^d(l - d) 2 = a(l - df 



Vi such that £?• = 0, £ 2 ^ and so that = £? ^ 



Id 2 



(signal)^ = d(d — 1), (noises)^ = 0, 

<(— ) 2 ) e = {H ~% B ~ 1] (i - + (g "^ ( f" 2) (i - d) 2 d = a(l - d) 2 . 



(59) 



(60) 



Therefore, in the regime of low storage of strategies we are exploring (a — 0), the retrieval is stable, 
states are well defined and the amplitude of the signal on the first channel is order (1 — d) while on the 
second is of order d(l — d), in perfect agreement with both the statistical mechanics analysis and Monte 
Carlo simulations. 

Once proved that these parallel states exist, it would be interesting trying to understand deeper their 
structure in the configurational space. To this task let us fix a pattern with i = 1,...,H, and a 
dilution d, in such a way that H d of entries are expected to be null and the remaining H(l — d) are 
expected to be half equal to +1 and half equal to —1. The number of helper configurations displaying 
maximum overlap with £ corresponds to the degeneracy induced by null entries, namely 2 Hd ; all these 
configurations lay in an energy minimum because their Mattis magnetization is maximum (actually the 
same holds for the symmetrical configurations due to the gauge symmetry of the model). 

Let us now generalize this discussion by introducing the number of configurations n(m, d) whose 
overlap with the given pattern displays m misalignments in such a way that n(m, d) is given not only by 
the degeneracy induced by null entries, but also by the degeneracy induced by the choice of m entries out 
of H(l — d) which have to be mismatched. It is easy to see that n(m,d) = 2 Hd ( H ^ 1 m d " > ) . Interestingly, 
for such configurations the signal felt by a spin i can be written as ipi = [H((l — d)) — 2m] and the 
effect of the correction due to the m misalignments might be vanishing in the presence of a sufficiently 
large level of noise, so that the system is not restricted to the 2 Hd configurations corresponding to the 
minimum energy, but it can also explore all the configurations n(m,d). 

Therefore, we can count the number of configurations h(x,d) exhibiting a number of misalignments, 
with respect to £ , up to a given threshold x; in the presence of noise such configurations are all accessible, 
namely they all lay in the same "deep" minimum. Indeed, we can write n(x, d) = Y^m=o n ( TO i d); °f course, 
for x = H(l — d) we recover fi(x, d) = 2 H . Moreover, when x = H(l — d)/2, we can exploit the identity 
J2k=o ik) = 1/2[4 J + ( 2l )] [24], and assuming without loss of generality H(l — d) to be even we get 



i(H(l - d)/2, d)=Y^ n(m, d) = 



■>Hd 



m=0 



2-Ff(l-(2) 



H(l-d) 
H(l - d)/2 



2* 
2 



vrid(l - d) 



, (61) 
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Figure 7: (Color on line) Normalized number of accessible configurations n(x,d) as a function of x and 
d for a system made up of H = lymphocytes. The critical line x c = (1 — d), corresponds to the emergence 
of a giant component. 



where in the last passage we used the Stirling approximation given that H{{1 — d)) 3> 1. Then, we have 
h(H(l — d)/2,d) > 1/2, and similar calculations can be drawn for smaller thresholds, e.g., h(H(l — d)/2 — 
M)<l/2. 

As shown in Fig. [TJ once d is fixed, when x is small only a microscopic fraction h(x,d)/2 H of con- 
figuration is accessible (in the thermodynamic limit this fraction is vanishing), while by increasing the 
tolerance x, more and more configuration get accessible and correspondently their fraction gets macro- 
scopic. From a different perspective, each configuration can be looked at as a node of a graph and those 
accessible are connected together. The link probability is then related to x and when x is large enough 
a "giant component" made up of all accessible configurations emerges. This is a percolation process in 
the space of configurations. Indeed, similarly to what happens in canonical percolation processes, the 
curves representing the giant component relevant to different sizes H intersect at around 1/2, which 



distinguishes the percolation threshold x c . According to Eq. 61 we can write x c « H(l — d)/2. 



Interestingly, when a giant component emerges retrieval is no longer meaningful because the system 
may retrieve essentially anything and this corresponds to the critical line (in the d, /3 plane) where all the 
magnetization simultaneously disappear. 



5 Numerics 

In this Section we discuss details on Monte Carlo simulations. 

All the simulations have been performed on a system Ubuntu Linux with Intel Core 17, 3.2Ghz, 12 
CPU, Nvidia-Fermi technology, 12 Gb RAM and OpenMP libraries. The simulations were carried out 
sequentially according to the following algorithm: 

1. Building and storaging of the coupling matrix. 

First, we generate B patterns according to the distribution {d = 0): 

P($) = + (62) 

then, we build a char-matrix Jij — ^ with entries ranging £ [0, 2B + 1] and acting as key 

pointing to another hash-matrix where the H(H — l)/2 real numbers accounting for the Hebb 
interactions (see eq. (16)) are stored. If the amount of patterns do not exceed B — 256, i.e. one 
byte, it is then possible to account for 10 5 helpers with no need of swapping on hard disk (which 
would sensibly affect the performance of the simulation). This condition is fulfilled for the low 
storage regime we are interested in. 

2. Initialize the network status. 

We checked the two standard approaches: The first is to initialize the network in a (assumed) fixed 
point of the dynamics, namely 

hi = & Vie{l,...,H] (63) 
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and check its evolution: This gives important information on the structure of the basins of attrac- 
tion of the minima as we vary the dilution (see Point 5). 



The second approach is to initialize the network randomly: We set hi — 1 with probability 0.5 and 
hi = — 1 otherwise. This is a standard procedure to follow the relaxation to a fixed point with no 
initial assumption and gives important information on the structure of the basins of attraction of 
the minima at fixed dilution. 

3. Evolution dynamics 

The activity of helpers evolves according to a standard (random and sequential) Glauber dynamics 
for Ising-like systems [5]: At each time interval, the state of a lymphocyte is updated according to 
its input signals, where the probability of the unit's activity is equal to a rectified value of the input 
(logit transfer function), i.e. 

Prih^t) = ±1) = — 1 (64) 

1 + exp[=F2/? J%jhj\. 

The field-updating process is managed by a linked list whose parsing is parallelized through OpenMP. 

4. Convergence of the simulation. 

Due to the peculiar structure of the fields induced by pattern dilution (see Fig. 3, right panel), the 
field insisting on a given helper may be zero and the related spin would flip indefinitely. To avoid 
this pathological situation we skip the updating of these "paramagnetic" lymphocytes and focus on 
the remaining ones: In the zero noise limit convergence is almost immediate, such that when the 
whole ensemble of helpers remains unchanged for the whole iV-length of the update cycle, dynamics 
is stopped and the resulting B Mattis magnetizations are printed on a file. 

Relaxation at non-zero noise is checked through the linked list (see next step): The pointer of each 
helper that is aligned with its own field is stored, the ones of helpers with no net fields are removed 
from the linked list, while all the other helpers mismatched to their own fields, are added into the 
linked list. 

5. Increase of B pattern dilution. 

There can be two deeply different ways of increasing dilution. The former is a Bernoullian approach 

and essentially if one starts from a dilution d = 0.45 toward a dilution d = 0.5 essentially may forget 

the starting information and generate a random pattern with on average one half of zero entries; 

the latter is a Markovian dilution by which one needs to start from the previous coupling matrix 

(and patterns) diluted at d = 0.45 and increases dilution on that structure. 

Dilution is tuned at steps of 0.01, ranging from d = to d = 1. 

We take as the state of the network the last equilibrium state, then go to point (3). 

Trough Markovian dilution, we can follow the evolution, varying d, of the pure Hopfield attractonp^| 
Examples of results obtained via numerical simulations can be seen in Fig. 6 and are in general in perfect 
agreement with theory. 



6 Summary and outlooks 

In this paper we continued our modeling of immune networks through statistical mechanics. In particular, 
we recently proposed a model for the adaptive immune response by which helpers and B-cells interact via 
cytokines and are described as a fully-connected bipartite spin glass. We also showed that the model is 
equivalent to an attractor associative network where helpers are able to arrange B-cells and orchestrate 
responses; we underline that this first network was able to elaborate only one strategy at a time, namely, 
helpers managed only one clonal lineage of B-cells before turning to another one. 

Here we introduce dilution in the bipartite spin-glass such that only a fraction of the whole B-repertoire 
interacts with a given helper lineage, which is a much more biological description and we obtain remark- 
able emergent behavior. At first, we have shown that diluting the bipartite B-H spin-glass is very different 
from diluting directly the resulting H-H associative network. In particular, while in the latter the peak 
of the distribution of the fields insisting on the dynamical variables is a monotone decreasing function 

10 Trough this dynamics, the bifurcation tree on dilution is obtained: Note the similarity among the self-consistence 
equations (see e.g. Eq.s (36,37)) and the logistic map. 
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as dilution is increased, the former -our dilution- shows a minimum (as a function of noise level) and a 
high non trivial behavior. This hides in fact a new important property of these networks: They develop, 
as an emerging feature, the ability of retrieval of multiple strategies together; namely helpers are able to 
orchestrate and coordinate the responses of several B-clones at the same time. 

Remarkably, each time the network of helpers retrieves patterns/strategies for (the correct) B-cell expan- 
sions, the corresponding Mattis magnetization is maximized (e.g. each helper belonging to the retrieval 
has the same sign of the cytochine linking it to the B-clone). This means that the field that insists on 
the B-clone, which on average (when no retrieval exists) is zero, becomes ~ (1 — d) > such that each 
retrieval in the helper network generates a "magnetic field" on the B-clone (in the B-H network) and 
consequenly the latter is forced to expand. 

Another interesting remark is that the strength of the field increases as dilution is decreased: This could 
gives hint about the microscopic interpretation of the dilution in terms of reaction-diffusion kinetics of 
lymphocytes and their signalling (on which we plan to report soon), as, for instance, increasing the mean 
square velocity of these elements (e.g. via fever and increased heartbeats), decreases network dilution 
and, consequently, increased the quality and the strength of the signal on the B-cells. 
In particular we studied in detail the case where helpers manage an amount of B-clones proportional to 
the logarithm of the H repertoire, so to say, if the repertoire of helpers is made of by e.g. 10 10 different 
clones (like in humane immune systems), the network is able to manage at the same time O(10) different 
clonal lineage of B-cells. 

This is an important step forward a rationale understanding of these lymphocyte networks because 
the immune system is always supposed to fight several pathogens at each time. 

Turning to technicalities, we studied the model via statistical mechanics solving for the free-energy 
and obtaining through the extremization of the latter, the order parameter self-consistencies, which have 
been explicitly written for the simplest case of two and three parallel retrieved patterns. These equations 
have been hierarchically solved and tested against the results coming from signal-to-noise analysis and 
Monte Carlo simulations: Once showed that the "pure state ansatz" of neural networks in this context 
can no longer minimize the free energy, we introduced a "parallel ansatz" and performed the signal-to- 
noise analysis to confirm its validity by studying the stability of the basin of attraction of the minima it 
generated. Further, we performed Monte Carlo simulations to confirm numerically the whole scenario. 
Overall we found perfect agreement among all the results stemmed from these different techniques. 

Future works, beyond the microscopical interpretation of the tunable parameters, would investigate 
the saturated case, which is still mathematically challenging; then the system in presence of antigens 
(fields) and a detailed refinement discriminating between B-clones (which here are all supposed to share 
the same structural characteristics) with low/high avidity against own tissues with the aim of showing 
further emerging properties as the clonal anergy for high avidity B-clonal lineages resulting in self/non-self 
discrimination problems. 
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